(54) Title: ALTERATION OF RESTRICTION ENDONUCLEASE SPECIFICITY BY GENETIC SELECTION

(57) Abstract: Methods and compositions are provided for altering the DNA recognition and cleavage characteristics of an
endonuclease without prior knowledge of the endonuclease's three-dimensional structure and/or amino acid residues responsible
for activity and/or specificity. Methods include subjecting a mutagenized endonuclease gene library to a genetic selection in
prokaryotic cells which tolerate the expression of mutated endonuclease and where the endonuclease is active and determining
the altered recognition-site specificity for the endonuclease.

ALTERATION OF RESTRICTION ENDONUCLEASE
SPECIFICITY BY GENETIC SELECTION 

BACKGROUND OF THE INVENTION

Type II restriction endonucleases are a class of enzymes 
that occur naturally in bacteria where they serve as host
defense systems, functioning to prevent infection by foreign
DNA molecules such as bacteriophage and plasmids that would
otherwise destroy or parasitize them. In this defense system,
foreign DNA is restricted (cleaved) while host DNA is protected
due to modification of the recognition sites by a cognate DNA 
methyltransferase. This relationship with a DNA methylase 
ensures that the endonuclease maintains an extremely high 
degree of specificity. In the course of evolution, any variants 
with non-cognate restriction activity are subject to strong 
selection pressure in the form of DNA damage encountered by
the bacterial cell. 

The remarkable substrate specificity of restriction 
endonucleases has contributed greatly to the biotechnology 
revolution. Purification of restriction endonucleases from 
bacteria allows these enzymes to be used in numerous
laboratory applications from gene cloning to mutation detection. 
Type II restriction endonucleases typically recognize a DNA 
sequence of 4-8 base pairs (bp). Of the greater than 3000 
enzymes characterized so far, 228 distinct substrate specificities 
have been identified (Roberts and Macelis, Nucl. Acids Res.
29:268-269 (2001)). The substrate specificity of a restriction 
endonuclease usually involves single site recognition (e.g. 5'- 
AGATCT-3'). However, a relatively common feature is recognition
of a degenerate sequence. For example, BstYI recognizes 5'-
RGATCY-3' (where R = A or G and Y = C or T). Recognition of a 
degenerate sequence often limits the utility of a restriction 
enzyme in laboratory applications since cleavage frequency is 
excessive. Statistically, an enzyme recognizing a unique 6-bp
sequence cleaves once every 4096 bp while an enzyme recognizing
5'-RGATCY-3' cleaves once every 1024 bp on average in a non-biased
genome. Restriction endonucleases that cut infrequently (e.g. 8-bp
cutting enzymes that cleave once every 65,536 bp) are rarely found in
nature. Therefore, many engineering efforts are focused on 
creating less frequent cutters out of existing 6-bp cutters. 
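
The cleavage frequencies quoted above follow from simple counting, and the short sketch below reproduces them. It assumes an unbiased random sequence in which each base occurs with probability 1/4. The function name `expected_spacing` and the 8-bp test site GCGGCCGC are illustrative choices that do not come from the original text.

```python
from fractions import Fraction

# Number of bases matched by each IUPAC nucleotide code.
IUPAC_DEGENERACY = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}


def expected_spacing(site: str) -> Fraction:
    """Average distance in bp between occurrences of `site` in a random,
    unbiased sequence (hypothetical helper; each base has probability 1/4)."""
    probability = Fraction(1)
    for code in site.upper():
        probability *= Fraction(IUPAC_DEGENERACY[code], 4)
    return 1 / probability


for site in ("AGATCT", "RGATCY", "GCGGCCGC"):
    print(site, int(expected_spacing(site)))
```

Each degenerate position (R or Y) doubles the probability of a match at that position, which is why the 5'-RGATCY-3' site is expected four times as often as a unique 6-bp site.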

More than ten years of endonuclease engineering has
resulted in only limited success in altering the substrate
specificity of an existing restriction endonuclease. One important 
example is an attempt to engineer an 8-bp cutting enzyme from 
EcoRV (5'-GATATC-3') by using rational protein design based on 
the high resolution structures of EcoRV complexed with 
alternate 8-bp substrates (Norton and Perona, J. Biol. Chem. 
273:21721-21729 (1998)). In this case, rational protein design 
pertained to creating one or more specific amino acid 
substitutions by site-directed mutagenesis of the cloned gene 
fragment. A conclusion of this effort was that the determinants 
of altering substrate specificity are difficult to predict even after 
crystallographic analysis of an endonuclease/DNA substrate 
complex (Lanio, et al., Protein Eng. 13:275-281 (2000)). The
most promising EcoRV variant was derived from a semi-rational 
approach where twenty-two amino acid residues were chosen 
for randomization based on examination of the three-
dimensional structure. Clones of interest were selected by in
vitro analysis of cleavage activity and specificity. From this
effort, a triple mutant was identified which exhibited a 25-fold
higher rate of cleaving EcoRV sites flanked by AT rather than GC
base pairs (Lanio, et al., J. Mol. Biol. 283:59-69 (1998)).

Many other studies have been conducted to investigate 
and possibly alter the substrate specificity of the restriction 
enzymes BamHI (Dorner and Schildkraut, Nucl. Acids Res.
22:1068-1074 (1994), Dorner, et al., J. Mol. Biol. 285:1515-
1523 (1999), Whitaker, et al., J. Mol. Biol. 285:1525-1536
(1999), Newman, et al., Science 269:656-663 (1995), Newman,
et al., Nature 368:660-664 (1994)) and EcoRI (Ivanenko, et al.,
J. Biol. Chem. 379:459-465 (1998), Heitman and Model,
Proteins 7:185-197 (1990), Heitman, Bioessays 14:445-454
(1992), Muir, et al., J. Mol. Biol. 274:722-737 (1997), Flores, et
al., Gene 157:295-301 (1995)). Again, structure-based rational
or semi-rational design approaches were employed with no 
absolute change of specificity reported. To date, a total of 
twelve structures of restriction enzymes have been determined 
(Pingoud and Jeltsch, Nucl. Acids Res. 29:3705-3727 (2001)). 
From these structures, it is becoming clear that substrate
recognition does not adhere to a distinct set of rules. 
Consequently, the likelihood of engineering novel substrate 
specificities into existing endonucleases by purely rational 
design methods remains low. Furthermore, protein structure 
determination remains a costly and time-consuming
endeavor.

Consequently, a rapid and more successful method of
endonuclease engineering is required to identify amino acid 
substitutions responsible for altering substrate specificity 
without the requirement of protein structural information. 

SUMMARY OF THE INVENTION

The present invention provides a novel method for altering 
the DNA recognition and cleavage specificity of an 
endonuclease. Prior knowledge of the endonuclease three-
dimensional structure and/or knowledge of the amino acids
responsible for substrate recognition are not required. 

In an embodiment of the invention, a method is provided for
altering an endonuclease recognition site specificity, that includes:
subjecting a mutagenized endonuclease gene library to a genetic
selection in a population of prokaryotic host cells expressing one or 
more non-cognate DNA methyltransferases, wherein the genetic 
selection selects for viable cells in the population; and identifying 
whether the viable cells express an active mutated endonuclease 
with an altered recognition site specificity. 

The mutagenized endonuclease gene library may be formed
by: error-prone PCR, chemical mutagenesis, assembly PCR, DNA
shuffling, in vivo mutagenesis, cassette mutagenesis, recursive
ensemble mutagenesis or exponential ensemble mutagenesis.
The endonuclease activity may be attenuated by modifying the
mutagenized endonuclease gene library using modification means
selected from: creating an amber codon within the open reading
frame; creating an opal codon within the open reading frame;
changing the start codon to GTG or TTG; mutating the ribosome
binding site (RBS) sequence; or utilizing a T7 expression vector
wherein the host cell is T7 RNA polymerase negative.

In addition to the above, viable prokaryotic host cells may be
pooled and plasmid DNA isolated from the cells, where the plasmid
DNA encodes mutagenized endonuclease genes from the library, and
the plasmid DNA may be transformed into a population of indicator
cells for detecting DNA damage. The mutagenized endonuclease genes
may be subjected to repeated genetic selections in the population of
host cells described above and in the population of indicator cells,
where the genetic selection in the population of indicator cells
includes a first population of indicator cells lacking non-cognate
methylase and a second population of indicator cells expressing the
non-cognate methylases.

The sequence of the altered recognition site can be 
determined where the altered specificity for the site may be relaxed 
recognition-site specificity, increased recognition-site specificity or 
alternate recognition-site specificity. 

In a further embodiment of the invention, a method is 
provided for altering recognition site specificity of an endonuclease, 
that includes: creating a mutagenized endonuclease gene 
expression plasmid library from a target endonuclease gene and 
transforming prokaryotic cells with the mutagenized library, wherein
the prokaryotic cells express one or more non-cognate
methyltransferases; selecting prokaryotic cells which are viable after
transformation and isolating plasmid DNA from the viable cells;
determining whether the isolated plasmid DNA encodes an active
endonuclease by transforming the plasmid DNA into DNA damage
indicator cells; screening the plasmid DNA encoding the active 
endonuclease for altered specificity; and optionally repeating the 
above protocol to obtain the endonuclease with altered recognition- 
site specificity. An example of the above method is an endonuclease 
having an altered recognition site derived from BstYI or NotI.

In a further embodiment of the invention, a method is
provided for modifying recognition-site specificity of an
endonuclease from a parent specificity to a target specificity, that
includes: obtaining a sequence for a plurality of mutated
endonucleases obtained by the methods described above to
determine the mutation for each mutated endonuclease; and
mutating a gene encoding the endonuclease to produce one or
more of the mutations identified above so as to provide the target
specificity for the endonuclease. 

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1A is an outline of the genetic selection procedure.
Rgene*, mutated endonuclease gene library; Mgene, non- 
cognate DNA methylase gene. 

Figure 1B is an outline of the genetic selection procedure
as applied to isolate variants of BstYI with AGATCT specificity.

Figure 2 is the nucleotide (SEQ ID NO:1) and amino acid
sequence (SEQ ID NO:2) of BstYI endonuclease. 

Figure 3 is the DNA cleavage pattern of clone NN1 vs.
wild-type BstYI assayed on 0.5 µg substrate pUCAdenoXba.
Lanes 1-4 represent cleavage with 0.5, 1.0, 2.0 and 4.0 units
wild-type BstYI for 1 hour, respectively. Lanes 5-8 represent
cleavage with an excess of clone NN1 for 30, 60, 90 and 120
min, respectively. Lane 9 is complete digestion of 0.5 µg
pUCAdenoXba with BglII to show the pattern of restriction at 5'-
AGATCT-3' only. All reactions were incubated at 55°C.

Figure 4 is the DNA cleavage characteristics of clone NN1
vs. wild-type BstYI assayed on radiolabeled DNA substrates (60 
bp) each containing one of the three unique BstYI recognition 
sites. 

Figure 4A is cleavage of the 5'-AGATCT-3' site. Figure 4B
is cleavage of the 5'-GGATCC-3' site. Figure 4C is cleavage of the
5'-AGATCC-3' site (bottom strand is 5'-GGATCT-3'). In each
part, lane 1 is the substrate only. Lanes 2-5 represent cleavage
with wild-type BstYI for 2, 4, 6, and 10 min, respectively. Lanes
6-9 represent cleavage with clone NN1 for 5, 10, 20, and 30
min, respectively. All reactions were incubated at 60°C.
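
The statement that BstYI has three unique recognition sites can be checked by expanding the degenerate sequence 5'-RGATCY-3' and merging sequences that are reverse complements of one another. The following is a minimal sketch of that bookkeeping. The helper names (`expand`, `reverse_complement`, `unique_duplex_sites`) are hypothetical, and only the IUPAC codes needed for this site are included.

```python
from itertools import product

# Only the IUPAC codes needed for 5'-RGATCY-3' (R = A or G, Y = C or T).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG", "Y": "CT"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def expand(site):
    """List every unambiguous top-strand sequence matched by `site`."""
    return ["".join(bases) for bases in product(*(IUPAC[code] for code in site))]


def reverse_complement(seq):
    """Return the bottom strand read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def unique_duplex_sites(site):
    """Merge sequences that describe the same double-stranded site."""
    canonical = {min(seq, reverse_complement(seq)) for seq in expand(site)}
    return sorted(canonical)


print(expand("RGATCY"))               # four top-strand sequences
print(unique_duplex_sites("RGATCY"))  # three distinct duplex sites
```

The expansion gives four sequences, but AGATCC and GGATCT are the two strands of the same duplex, which leaves the three sites assayed in Figures 4A to 4C.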



WO 03/060152 PCT/US03/00542




DETAILED DESCRIPTION OF THE INVENTION

Definitions 

For convenience, certain terms employed in the 
specification, examples and appended claims are collected here.

"Non-cognate methylase" refers herein to a DNA
methyltransferase possessing a substrate specificity that does
not protect a target recognition site(s) of a given endonuclease.

"Mutation" refers herein to any gene alteration including
DNA rearrangement; nucleotide(s) substitution, addition and/or
deletion that results in expression of an endonuclease
possessing an altered amino acid sequence.

"Active" refers herein to an endonuclease capable of
binding a recognition site(s) or capable of binding and cleaving a
recognition site(s). Binding activity may be determined, for 
example, by an electrophoretic mobility shift assay (Thompson 
and Landy, Nucl. Acids Res. 16:9687-9705 (1988)), filter 
binding assay (Zhenyu Zhu, New England Biolabs) or may be 
reported by an indicator strain capable of sensing site-specific
DNA binding. Cleavage activity may be determined in vivo by an 
indicator strain capable of reporting DNA damage or in vitro by 
incubating cell extract or partially purified endonuclease with an 
appropriate DNA substrate (Xu and Schildkraut, J. Biol. Chem.
266: 4425-4429 (1991)). 



"Recognition site" refers herein to an uninterrupted or
interrupted DNA sequence to which an endonuclease is
preferentially bound. DNA cleavage by the endonuclease may
occur within or outside of the recognition sequence.

"Tolerated" refers herein to maintenance of cell viability 
through DNA methylation protection. 

"Attenuation" refers herein to intentionally decreasing the 
in vivo DNA damaging effects of a given endonuclease gene or 
gene library. Attenuation may occur at the level of transcription, 
translation, or by alteration of the endonuclease specific activity 
by mutagenesis.

"Altered specificity" refers herein to any measurable
endonuclease activity which differs from the wild-type or parent 
endonuclease. Altered specificity includes relaxed specificity, 
increased specificity or alternate specificity. 

"Relaxed specificity" refers herein to increased promiscuity 
of an endonuclease with respect to its recognition site. 

"Endonuclease gene library" refers herein to a collection of
genes where a majority of the library members are unique with
respect to their nucleotide sequence (where a single nucleotide
difference between two members qualifies as being unique). The
library may be derived from one or more endonuclease genes or
each member may be artificially constructed to possess the 
general characteristics of an endonuclease gene sequence. 

In an embodiment of the invention, we provide a method 
for the genetic selection of endonuclease variants possessing 
altered substrate specificity as compared to the parent 
endonuclease. The parent endonuclease may be the product of a 
wild-type endonuclease gene isolated from nature or the parent 
endonuclease may itself be a variant isolated by any other
means. For example, in the development of an endonuclease 
possessing an altered recognition site specificity, an important 
first step may be to isolate a variant with a relaxed specificity. 
Once the new recognition sites are determined, one or more 
non-cognate DNA methylases can be employed in vivo to protect 
the host genomic DNA and thus serve as an enabling factor in
the genetic selection procedure of the present invention. The
selection of a relaxed specificity variant may be accomplished by 
transforming a mutagenized endonuclease gene library into a 
DNA-damage indicator strain protected by a cognate DNA 
methylase (see example 2 and Heitman and Model, EMBO J. 
9:3369-3378 (1990)). 

In an embodiment of the invention, we subject a mutated 
endonuclease gene library to a plurality of genetic selections in 
E. coli or other suitable prokaryotic host. While the method may
be conducted with a minimum of two genetic selections, three or 
more selections are preferred to reduce false positives. The 
procedure yields active variants possessing a high likelihood of 
having one or more mutations which are either necessary or 
informative for altering the substrate specificity of the 
endonuclease. The desired substrate specificity is determined by 
protecting the host DNA with one or more non-cognate DNA 
methyltransferases. The host DNA may be methylated at an
alternate recognition site(s) to isolate variants with an alternate
cleavage preference or the host DNA may be methylated at one
or more sub-sites to isolate variants with cleavage preference
towards one or more sub-sites. 

During the in vivo selection process, host bacterial cells 
carrying endonuclease variants with DNA cleavage activity 
outside the spectrum of host DNA methylation are eliminated or 
at least are strongly selected against due to reduced growth. 
Therefore, this approach involving laboratory genetic selection
steps imposed upon a randomly mutated endonuclease gene
may be considered to be analogous to the evolutionary selection 
process dependent upon native host protection by a cognate 
DNA methyltransferase. 

A protocol we have developed to achieve the in vivo selection
process (see Figure 1) includes one or more of the following
steps: 
(i) generating a mutated endonuclease library within
an expression vector or plasmid.

(ii) introducing the endonuclease library into host cells
pre-modified with a non-cognate pattern of methylation. The
transformed cells are plated on media containing an inducer
molecule to maximize elimination of host cells harboring
endonuclease variants with DNA cleavage activity outside the
spectrum of host DNA methylation.

(iii) pooling survivors and isolating plasmid DNA from the cells.
Clones expressing active variants are isolated by transforming
the endonuclease plasmids only into an indicator strain for DNA
damage, preferably ER1992 (U.S. Patent No. 5,498,535). The
transformed cells are plated on media void of the inducer
molecule to avoid lethal levels of DNA damage, as the indicator
strain is not protected by methylation (or only partially
protected) in this step. The media does contain the substrate X-
gal to allow individual colonies to report significant levels of
cellular DNA damage upon induction of the SOS repair response
and expression of a dinD1::lacZ gene fusion.

(iv) active endonuclease clones may be isolated by 
culturing individual dark blue colonies for a short time at a low 
temperature and preparing plasmid DNA from these cultures. 
Alternatively, dark blue colonies are pooled and plasmid DNA is 
prepared without culturing in order to maintain the integrity of
active clones. 

(v) the individual plasmid isolates (or pooled plasmid
DNA) can then be introduced into the DNA damage indicator 
strain which is pre-modified with the same pattern of 
methylation as in step (ii). The emergence of white colonies on 
X-gal media at 37°C indicates that the endonuclease variant
may be exhibiting cleavage preference towards the site(s) 
protected by methylation. 
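
The logic of steps (ii), (iii) and (v) can be summarized as a small decision model. The sketch below is a deliberately simplified illustration and not part of the described protocol. It assumes that a variant is fully described by the set of duplex sites it cleaves in vivo, that any cleavage at an unprotected site is lethal under induction, and that the same cleavage yields a blue colony without induction. The class `Variant` and the function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    name: str
    cleaves: frozenset  # duplex sites cut in vivo; an empty set means inactive


def survives_induction(variant, protected):
    """Step (ii): the host survives only if every cleaved site is methylated."""
    return variant.cleaves <= protected


def colony_color(variant, protected):
    """Steps (iii) and (v): blue if any cleaved site is unprotected, else white."""
    return "blue" if variant.cleaves - protected else "white"


def passes_round(variant, protected):
    """Survive step (ii), turn blue without methylation, turn white with it."""
    return (
        survives_induction(variant, protected)
        and colony_color(variant, frozenset()) == "blue"
        and colony_color(variant, protected) == "white"
    )


# Assumed example: host DNA methylated at AGATCT only.
PROTECTED = frozenset({"AGATCT"})
candidates = [
    Variant("wild-type-like", frozenset({"AGATCT", "AGATCC", "GGATCC"})),
    Variant("inactive", frozenset()),
    Variant("AGATCT-specific", frozenset({"AGATCT"})),
]
for variant in candidates:
    print(variant.name, passes_round(variant, PROTECTED))
```

In this model the first selection removes variants that still cleave unprotected sites, the blue screen removes inactive clones that survived only because they do nothing, and the final white screen confirms that the remaining activity is confined to the protected site. Real colonies show graded responses rather than the all-or-nothing outcomes assumed here.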

The methods described herein are effective in altering 
cleavage specificity of thermophilic restriction enzymes as these 
enzymes are less active at 30°C-37°C and are better tolerated 
by DNA damage indicator cells not protected by methylation. 
Yet, endonucleases from mesophilic organisms may be 
engineered by this method by first attenuating the endonuclease 
activity and/or expression level as described in Examples 2 and 
3 of the present invention. 

We provide examples of a stringent selection method where 
an estimated 10^ variants can be rapidly screened in one round.
In Example 1, this was applied to increase the substrate
specificity of BstYI (5'-RGATCY-3') to single site recognition (5'-
AGATCT-3'). After one round (three selection steps), forty-five
clones were analyzed in vitro for activity and specificity. Of
those forty-five clones, two variants were found to exhibit a
preference for cleavage of 5'-AGATCT-3' over other BstYI sites.
By combining single mutations present in each of these clones,
a superior clone designated NN1 was isolated. NN1 displays a 7-
fold preference for cleavage of 5'-AGATCT-3' relative to 5'-
AGATCC-3' or 5'-GGATCT-3' and cleavage of the 5'-GGATCC-3'
site is not detected (Figure 4). Embodiments of the inventive
method are provided below and comprise the following steps,
although as the skilled artisan will appreciate, modifications to 
these steps may be made without adversely affecting the 
outcome: 

Although not essential, the skilled artisan may choose to 
consult any available gene homology or protein structural 
information that pertains to the endonuclease under study or to 
the appropriate family of endonucleases. This information may 
be useful in determining and designing the method of 
mutagenesis. 

1) A given endonuclease gene is mutagenized by
methods generally known in the art including any of: 

(a) Error-prone PCR (Leung, et al., Technique 1:11-15
(1989), Cadwell and Joyce, PCR Methods Applic. 2:28-33
(1992)). 

(b) Hydroxylamine, sodium bisulfite or any other
chemical mutagen treatment (Sambrook, J., et al., Molecular
Cloning: A Laboratory Manual, 2nd ed. (1989)).

(c) Oligonucleotide directed mutagenesis, preferably the 
overlap extension PCR mutagenesis method (Morrison and 
Desrosiers, Biotechniques 14:454-457 (1993)).

(d) Assembly PCR. The term "assembly PCR" refers to a
process that involves the assembly of a PCR product from a
mixture of small DNA fragments. A large number of different
PCR reactions occur in parallel in the same vial, with the
products of one reaction priming the products of another
reaction.

(e) Sexual PCR mutagenesis. The term "sexual PCR
mutagenesis" (also known as "DNA shuffling") refers to forced
homologous recombination between DNA molecules of different
but highly related DNA sequence in vitro, caused by random
fragmentation of the DNA molecule based on sequence
homology, followed by fixation of the crossover by primer
extension in a PCR reaction (Stemmer, Proc. Natl. Acad. Sci.
USA 91:10747-10751 (1994)).

(f) In vivo mutagenesis. The term "in vivo
mutagenesis" refers to a process of generating random
mutations in any cloned DNA of interest which involves the
propagation of the DNA in a strain of E. coli that carries
mutations in one or more of the DNA repair pathways. These
mutator strains have a higher random mutation rate than that of
a wild-type strain. Propagating the DNA in one of these strains
will generate random mutations within the DNA (Long-McGie, et
al., Biotechnol. Bioeng. 68:121-125 (2000)).

(g) Cassette mutagenesis. The term "cassette
mutagenesis" refers to any process for replacing a small region
of a double-stranded DNA molecule with a synthetic
oligonucleotide cassette that differs from the native sequence.
The oligonucleotide often contains completely or partially
randomized native sequence (Dorner, et al., J. Mol. Biol.
285:1515-1523 (1999)). 

(h) Recursive ensemble mutagenesis. The term
"recursive ensemble mutagenesis" refers to an algorithm for
protein engineering (protein mutagenesis) developed to produce
diverse populations of phenotypically related mutants whose
members differ in amino acid sequence. This method uses a
feedback mechanism to control successive rounds of
combinatorial cassette mutagenesis (Arkin and Youvan, Proc.
Natl. Acad. Sci. USA 89:7811-7815 (1992)).

(i) Exponential ensemble mutagenesis. The term 
"exponential ensemble mutagenesis" refers to a process for 
generating combinatorial libraries with a high percentage of 
unique and functional mutants, wherein small groups of residues 
are randomized in parallel to identify, at each altered position,
amino acids which lead to functional proteins (Delegrave and 
Youvan, Biotechnology Res. 11:1548-1552 (1993)) and random 
and site-directed mutagenesis (Arnold, Curr. Opin. Biotechnol. 
4:450-455 (1993)). 

Each of these techniques is described in detail in the cited
references, herein incorporated by reference.



2) The mutagenized endonuclease gene library is 
prepared and cloned into an expression vector by standard
techniques. The expression vector should be capable of inducible
over-expression and may exhibit low-level constitutive
expression of the endonuclease library. One such vector is
pAGR3 (New England Biolabs, see Figure 6) containing a Ptac
promoter, the lacIq gene, a four-fold repeat of the rrnb
transcription terminator upstream of the promoter, a ColE1
origin of replication and an ampicillin resistance gene. Other
suitable expression vectors may contain an IPTG-inducible
promoter (Plac, Ptac, Ptrc), or a T7 promoter. Examples of
suitable T7 vectors include pET21a, pET21at (a pET21a
derivative constructed at New England Biolabs where 4 copies of
the rrnb transcription terminator are inserted upstream of the
T7 promoter) and pAII17 (New England Biolabs, see Figure 7).
When such T7 vectors are employed, protein overexpression
occurs in a bacterial strain carrying the T7 RNA polymerase
gene normally regulated by lacI (e.g. ER2744 [fhuA2 lacZ::T7
gene1 glnV44 e14- rfbD1? relA1? endA1 spoT1? thi-1 Δ(mcrC-
mrr)114::IS10] or ER2848 [F' proA+B+ lacIq Δ(lacZ)M15
zzf::Tn10(TetR) fhuA2 lacZ::T7 gene1 glnV44 e14- rfbD1?
relA1? endA1 spoT1? thi-1 Δ(mcrC-mrr)114::IS10], New England
Biolabs). However, other inducible expression systems and
prokaryotic hosts may be substituted. 

3) The mutagenized endonuclease library may be
introduced into a bacterial strain pre-modified with a non-
cognate pattern of DNA methylation. Preferably, pre-
modification consists of transforming a bacterial strain with one
or more plasmids carrying a methylase gene and subsequently
preparing those cells for transformation of the endonuclease
library. For example, ER2502 ([fhuA2 ara-14 leu Δ(gpt-proA)62
lacY1 glnV44 galK2 rpsL20 endA1 R(zgb210::Tn10)TetS xyl-5
mtl-1 Δ(mcrC-mrr)HB101], New England Biolabs) may be used
when using a vector with an IPTG-inducible promoter, although
any strain may be used so long as it is tolerant of the imposed
DNA methylation state and it is capable of efficient chemical
transformation and electroporation. Other suitable strains
include, but are not limited to, JM107, HB101, NM522, ER1821,
ER2267, ER2744 and ER2848. When employing a vector with a
T7 promoter, ER2744 and ER2848 are preferred due to the
presence of an IPTG-inducible T7 RNA polymerase gene. Upon
endonuclease library introduction, surviving strains are selected
on Luria-Bertani (LB) agar plates containing the inducer
molecule, an antibiotic to select the endonuclease plasmid and 
an antibiotic to ensure maintenance of the methylase 
plasmid(s). 

4) Surviving colonies are pooled and plasmid DNA is 
prepared from the cells. The methylase plasmids are destroyed 
by digestion with at least two restriction enzymes. The pooled 
endonuclease library DNA is introduced into a DNA damage 
indicator strain, for example, ER1992 (NEB#907), which
contains the dinD1::lacZ gene fusion (U.S. Patent No.
5,498,535). Other DNA damage indicator strains may be
substituted for the above where the reporter gene is fused to a
promoter induced by the SOS repair response. Other DNA
damage-inducing promoters which can be used include dinA
(Iwasaki, et al., J. Bacteriol. 172:6268-6273 (1990)) and dinG
(Lewis, J. Bacteriol. 174:5110-5116 (1992)). Other
indicator/reporter genes which can be fused to any of the above
promoters include alkaline phosphatase (phoA) (Hoffman and
Wright, Proc. Natl. Acad. Sci. USA 82:5107-5111 (1985)),
luciferase (lux) (Engebrecht, Science 227:1345-1347 (1985)),
β-glucuronidase (Metcalfe, Gene 129:17-25 (1993)),
aminoglycoside phosphotransferase (Ward, et al., Mol. Gen.
Genet. 203:468-478 (1986)), and endoglucanase (Bingle, et al.,
Can. J. Microbiol. 39:70-80 (1993)). Where the lacZ reporter
gene system is utilized, transformants are selected at a
predetermined temperature (for example 30°C-42°C) on agar
plates containing an antibiotic to select for the endonuclease
plasmid and X-gal to serve as the substrate for blue color
development upon induction of the SOS response and
expression of the dinD1::lacZ gene fusion. Blue cells are
presumed to be carrying active endonuclease variants that 
survived the first selection in the presence of the non-cognate 
pattern of DNA methylation. 

5) Cells determined to be carrying the active restriction 
endonuclease gene are selected and cultured in order to isolate
a minimal amount of plasmid DNA for a subsequent
transformation. Alternatively, indicator colonies that are positive
for active endonuclease are pooled and plasmid DNA is prepared
without culturing in order to maintain the integrity of active
clones. 



6) The endonuclease clones derived from indicator 
colonies that are positive are transformed into the DNA damage
indicator strain now carrying the same methylase plasmid(s) as
in step 3 to provide the same pattern of methylation protection
as in the first genetic selection. For example, transformed cells
are plated on LB-agar containing the indicator substrate, an
antibiotic to select the endonuclease plasmid and an antibiotic to
maintain the methylase plasmid(s). If the non-cognate pattern
of methylation protects the cellular DNA from variant
endonuclease cleavage, then the indicator response for colonies
that are positive for the active endonuclease will not be induced
and colonies that are negative will result. For example, the
emergence of white colonies of a dinD1::lacZ strain suggests
that the endonuclease variant displays cleavage preference 
towards the recognition sequence(s) protected by the non- 
cognate methylase(s). 

7) To further analyze the novel endonuclease, aliquots
of individual colonies may be taken for plasmid DNA isolation
and sequencing. In one embodiment, the aliquots are taken
from white colonies cultured at 30°-37°C for 6-16 hours. The 
remaining cells may be lysed and the resulting extract analyzed 
for endonuclease activity and specificity towards an appropriate 
DNA substrate. The DNA substrate should allow distinction 
between restriction of the original recognition sequence(s) and a 
preference for the desired recognition sequence(s). 

8) Endonuclease variants displaying a preference for 
the desired recognition sequence(s) may be sequenced to 
identify those genetic changes and amino acid substitutions 
responsible for alteration of the enzyme specificity/recognition 
site preference.






9) Those skilled in the art of protein engineering will be 
able to evaluate the importance of each of the genetic changes
present in the selected clones and use this information to
rationally design improved endonuclease variants with desired
enzymatic properties. The mutations responsible for alteration of
enzyme specificity are identified by inference and/or site-
directed mutagenesis. The mutations deemed important for
altering substrate specificity are combined into one
endonuclease clone with the objective of creating a superior
endonuclease variant. This variant or variants selected directly
from one round can be mutagenized further and subjected to a
subsequent round of selection(s) until the desired variant is
isolated. At any point in the process, the skilled artisan may
choose to make site-directed changes to the endonuclease gene 
in order to maximize the efficiency of the selection process. 
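
Steps 8 and 9 amount to comparing each selected variant with the parent sequence and then combining the informative substitutions into one clone. A minimal sketch of that comparison follows. It assumes equal-length protein sequences with no insertions or deletions. The sequences shown are short made-up strings and are not BstYI sequences, and the helper names are hypothetical.

```python
def substitutions(parent, variant):
    """List amino acid substitutions as strings such as 'T3S' (1-based position)."""
    if len(parent) != len(variant):
        raise ValueError("sketch assumes equal-length sequences (no indels)")
    return [
        f"{old}{position}{new}"
        for position, (old, new) in enumerate(zip(parent, variant), start=1)
        if old != new
    ]


def combine(parent, *mutation_lists):
    """Apply substitutions from several variants to one copy of the parent."""
    residues = list(parent)
    for mutations in mutation_lists:
        for mutation in mutations:
            old, position, new = mutation[0], int(mutation[1:-1]), mutation[-1]
            if residues[position - 1] not in (old, new):
                raise ValueError(f"conflicting substitution at position {position}")
            residues[position - 1] = new
    return "".join(residues)


# Made-up toy sequences for illustration only.
parent = "MKTAYIAK"
variant_a = "MKSAYIAK"
variant_b = "MKTAYIGK"

muts_a = substitutions(parent, variant_a)
muts_b = substitutions(parent, variant_b)
print(muts_a, muts_b, combine(parent, muts_a, muts_b))
```

Whether a combined clone is actually superior still has to be established experimentally, as was done for clone NN1. The sketch only covers the sequence bookkeeping.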

The references referred to above and below are hereby 
incorporated by reference herein. 

The following Examples are provided to aid in the 
understanding of embodiments of the invention and are not 
intended as a limitation thereof. 
Alteration of BstYI performance to 5'-AGATCT-3' 
specificity by random mutagenesis, genetic selection and 
site-directed mutagenesis. 

The endonuclease BstYI recognizes and cleaves all 
hexanucleotide sequences described by 5'-RGATCY-3' with 
similar proficiency. In this Example, the genetic selection 
method of the present invention was applied to isolate one or 
more BstYI variants with cleavage preference towards AGATCT. 
The sequence AGATCT (Figure 1B) was methylated at the N4 
position of cytosine by the BglII N4-cytosine methyltransferase 
(Brooks and Roberts, Nucleic Acids Res. 10:913-934 (1982), 
Erlich, et al., J. Bac. 169:939-943 (1987)). Methylation of the 
N4 position of cytosine is known to inhibit cleavage of the 
AGATCT DNA sequence by BstYI. The BglII methylase gene 
(bglIIM) from Bacillus globigii (ATCC 49670) was isolated from 
pUC-BglIIM (Anton, et al., Gene 187:19-27 (1997)) by BamHI 
digestion and subsequently filled in with Klenow fragment to 
produce blunt ends. The blunt-ended bglIIM fragment was 
ligated into SmaI-digested/CIP-treated pSYX20 (Morgan, et al., 
Gene 183:215-218 (1996)). This clone, pSYX-BglIIM, was 
employed during the genetic selection procedure to provide 
methylation protection where indicated. 
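
The degenerate site 5'-RGATCY-3' stands for four concrete hexanucleotides. The following is a minimal sketch, not part of the disclosed method, that expands the standard IUPAC codes (R = A or G, Y = C or T) to list them. The helper name `expand_site` is illustrative only.

```python
from itertools import product

# Standard IUPAC nucleotide codes used for the sites discussed in this document.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "W": "AT", "N": "ACGT",
}

def expand_site(site):
    """Return every concrete sequence described by a degenerate site."""
    return ["".join(bases) for bases in product(*(IUPAC[code] for code in site))]

bstyi_sites = expand_site("RGATCY")
print(bstyi_sites)  # ['AGATCC', 'AGATCT', 'GGATCC', 'GGATCT']
```

Of the four sequences, AGATCT is the BglII site targeted by the selection and GGATCC is the BamHI site. The remaining two, AGATCC and GGATCT, are reverse complements of one another.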

The BstYI endonuclease gene (bstYIR) was amplified from 
pET21at-BstYIR (U.S. Patent No. 6,403,354 B1) by error-prone 
PCR (Leung, et al., Technique 1:11-15 (1989), Cadwell and 
Joyce, PCR Methods Applic. 2:28-33 (1992)). Two sets of PCR 
conditions were selected to give two levels of mutagenesis of 
the bstYIR gene (Shafikhani, et al., Biotechniques 23:304-310 
(1997)) resulting in mutagenic libraries A1 and A2. 

A1 PCR conditions were as follows: 8 ng plasmid template 
per 100 µl reaction, 0.4 µM forward and reverse primers, 5U Taq 
DNA polymerase, 7 mM MgSO4, 0.15 mM MnCl2, 0.2 mM dATP, 
0.2 mM dGTP, 1.0 mM dCTP, 1.0 mM dTTP, 10 mM KCl, 10 mM 
(NH4)2SO4, 20 mM Tris-HCl (pH 8.8 @ 25°C) and 0.1% Triton X- 
100. Thermocycling parameters were: 1 cycle for 3 min @ 94°C; 
15 cycles (1 min @ 94°C, 1 min @ 50°C, 1 min @ 72°C) and 1 
cycle for 7 min @ 72°C. A Perkin-Elmer 2400 thermocycler was 
used for all PCR reactions. A2 PCR conditions were exactly as A1 
conditions except 30 cycles of amplification were used. 
Sequencing the 612 bp bstYIR gene of genetically selected 
clones revealed an average of 5 nucleotide substitutions within 
the A1 mutagenic library and 11 nucleotide substitutions within 
the A2 mutagenic library. No frameshift mutations were 
observed in any of the sequenced clones. However, the majority 
of clones contained a stop codon, presumably allowing the BstYI 
variants to be tolerated in the presence of a non-cognate 
pattern of methylation. The amber stop codon TAG was found 
disproportionately, explaining why endonuclease activity was 
detected in each of the clones as this stop codon is suppressed 
to a significant degree due to the glnV44 locus present in both 
selection strains ER2502 (NEB#1149; New England Biolabs, 
Inc., Beverly, MA) and ER1992 (NEB#907; New England 
Biolabs, Inc., Beverly, MA). The glnV44 locus is responsible for 
glutamine incorporation at UAG via expression of a suppressor 
tRNA(Gln) molecule. The mutagenized bstYIR gene libraries A1 and 
A2 were digested with XbaI/XhoI and each was ligated into 50 
ng XbaI/SalI-digested, CIP-treated pAGR3 to create expression 
vector libraries pAGR3-BstYIR-A1 and pAGR3-BstYIR-A2. Ligation 
was carried out for 16 hours at 16°C followed by heat- 
denaturing the ligase for 30 min at 65°C. The 20 µl ligation 
reactions were drop-dialyzed against de-ionized water for 4 
hours. Then 10% of each reaction was mixed with 40 µl ER2502 
[pSYX-BglIIM] cells prepared for electroporation (BioRad Gene 
Pulser II protocol). The cell/DNA mix was electroporated at 1.8 
kV in a 0.1 cm cuvette (BioRad) followed by addition of 1.0 ml 
SOC media. The cell suspension was incubated at 37°C for 1 
hour and plated (160 µl x 6 plates) on LB-agar containing 100 
µg/ml ampicillin, 15 µg/ml tetracycline and 0.3 mM IPTG. The 
plates were incubated 16 hours at 37°C and survivors on each 
plate were pooled by LB broth resuspension and plasmid DNA 
was prepared from the cells on each plate. The plasmid DNA 
from each plate was digested with BamHI, SpeI and SalI to 
eliminate pSYX-BglIIM from each library. After heat 
denaturation of the restriction enzymes (20 min, 80°C), an 
aliquot of each digestion mix was transformed into 20 µl 
ER1992, the DNA damage indicator strain. Each transformation 
mix was plated on 4 LB-agar plates containing 100 µg/ml 
ampicillin and 40 µg/ml X-gal. Clones from the A1 expression 
library produced 4-8 dark blue colonies per plate while clones 
from the A2 expression library produced 1-4 dark blue colonies 
per plate. These individual dark blue colonies were cultured at 
30°C for 3 hours in LB and Amp and plasmid DNA was prepared 
from each culture. The individual DNA preparations were 
transformed into ER1992 chemically-competent cells carrying 
pSYX-BglIIM. The transformations were plated on LB-agar 
containing Amp, Tet and X-gal and incubated for 16 hours at 
37°C. Individual white colonies were cultured at 37°C, induced 
with 0.5 mM IPTG for 3 hours and assayed for endonuclease 
activity and preference for cleavage of AGATCT. The 
activity/specificity assay was conducted as follows: 

Ten milliliters of induced culture were pelleted and the 
supernatant was removed. The cell pellet was resuspended in 
1.0 ml sonication buffer: 10 mM Tris-HCl (pH 7.5 @ 25°C), 10 
mM β-mercaptoethanol, 0.1 mM EDTA. The cell suspension was 
sonicated twice for 25 sec using an Ultrasonics, Inc. Cell 
Disrupter. The cell debris was pelleted for 10 min at 14,000 
rpm. Two microliters of the resulting extract were added to the 
following 25 µl DNA cleavage reaction: 0.5 µg pUCAdenoXba 
substrate, 1x NEB BstYI buffer (10 mM Tris-HCl (pH 7.9 @ 
25°C), 10 mM MgCl2, 1 mM dithiothreitol), 0.1 mg/ml BSA. The 
substrate pUCAdenoXba (New England Biolabs, Inc., Beverly, 
MA) consists of an Adenovirus-2 DNA fragment from XbaI site 
10,579 to XbaI site 30,455 cloned into the XbaI site of pUC19. 



The circular substrate is 22,562 bp and contains 18 BstYI sites, 
4 BglII sites and 4 BamHI sites. As demonstrated in Figure 3 
(lanes 4 and 9), the digestion patterns of BstYI and BglII are 
distinctly different. Of 45 selected clones, 2 BstYI variants 
displayed a digestion pattern indicating a preference for 
cleavage of AGATCT. The amino acid substitutions of clone 9 
were R2opal/K49R/K87R/K133N. (R2opal = Arginine codon at 
position 2 changed to the TGA stop codon. Tryptophan is 
inserted at UGA at a low frequency by the tryptophan-specific 
tRNA.) The amino acid substitutions of clone A1e were 
Q28H/S172amber/Y176C. (S172amber = Serine codon at 
position 172 changed to the TAG stop codon.) Assuming that 
glutamine was being incorporated by amber suppression, the 
codon at position 172 of clone A1e was changed to CAG and the 
clone was renamed clone 121. Site-directed mutagenesis of the 
wild-type bstYIR gene determined that the substitutions of 
K133N and S172Q or S172N are responsible for alteration of 
BstYI specificity. The two substitutions, K133N and S172N, were 
combined in one BstYI variant that was designated NN1. 
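
The substitution notation used above (for example K133N) can be checked against the wild-type protein of SEQ ID NO:2. The sketch below is illustrative only: the function name `apply_substitutions` is an assumption, and the sequence string is transcribed from the sequence listing. It confirms that the named wild-type residues sit at the stated positions and builds the NN1 protein sequence in silico.

```python
import re

# Wild-type BstYI protein (203 residues), transcribed from SEQ ID NO:2.
BSTYI_WT = (
    "MRIVEVYSHLNGLEYIQVHLPHIWEEIQEIIVSIDAEACRTKESKEKTKQGQILYSPVAL"
    "NEAFKEKLEAKGWKESRTNYYVTADPKLIRETLSLEPEEQKKVIEAAGKEALKSYNQTDF"
    "VKDRVAIEVQFGKYSFVAYDLFVKHMAFYVSDKIDVGVEILPMKELSKEMSSGISYYEGE"
    "LYNVIRQGRGVPAVPLVLIGIAP"
)

def apply_substitutions(protein, substitutions):
    """Apply substitutions such as 'K133N' (1-based positions) to a protein."""
    residues = list(protein)
    for sub in substitutions:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", sub)
        if match is None:
            raise ValueError(f"unrecognized substitution: {sub}")
        old, pos, new = match.group(1), int(match.group(2)), match.group(3)
        # Guard against numbering mistakes: the wild-type residue must match.
        if residues[pos - 1] != old:
            raise ValueError(f"{sub}: position {pos} is {residues[pos - 1]}, not {old}")
        residues[pos - 1] = new
    return "".join(residues)

nn1 = apply_substitutions(BSTYI_WT, ["K133N", "S172N"])
print(nn1[132], nn1[171])  # N N
```

The same check accepts the other substitutions named in this Example that are written in plain one-letter form (K49R, K87R, Q28H, Y176C). Stop-codon entries such as R2opal and S172amber would need separate handling.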

NN1 was purified from strain ER2744 [pET21at-NN1, 
pACYC-BstYIM, pCEF8] in the following manner: 500 ml cells 
induced with 0.5 mM IPTG for 3 hours at 37°C were pelleted and 
the supernatant was removed. The cell pellet was resuspended 
in 10 ml sonication buffer: 10 mM Tris-HCl (pH 7.5 @ 25°C), 10 
mM β-mercaptoethanol, 0.1 mM EDTA. The cell suspension was 
sonicated 6 times for 25 sec. After sonication, 25 mM NaCl was 
added and the cell extract was heated at 65°C for 30 min to 
denature E. coli proteins. The heated extract was centrifuged at 
15,000 rpm for 30 minutes and the clarified supernatant was 
retained. Glycerol was added to a final concentration of 5% 
before loading the supernatant onto a Heparin-Sepharose FF 
column equilibrated with 10 mM Tris-HCl (pH 7.5 @ 25°C), 25 
mM NaCl, 10 mM β-mercaptoethanol, 0.1 mM EDTA and 5% 
glycerol. Fractions were eluted with a 0.025-0.8 M NaCl 
gradient. Fractions containing >90% endonuclease as 
determined by SDS-PAGE analysis were pooled and the NaCl 
concentration was adjusted to 0.3 M. The pooled fractions were 
allowed to flow through a DEAE-Sepharose column to 
accomplish DNA/RNA removal. The DEAE-Sepharose column was 
pre-equilibrated with 10 mM Tris-HCl (pH 7.5 @ 25°C), 0.3 M 
NaCl, 10 mM β-mercaptoethanol, 0.1 mM EDTA and 5% 
glycerol. The DEAE flow-through was dialyzed overnight at 4°C 
in 20 mM Tris-HCl (pH 7.5 @ 25°C), 100 mM KCl, 10 mM β- 
mercaptoethanol, 0.1 mM EDTA and 5% glycerol. Glycerol was 
added to a final concentration of 50% and the purified protein 
was stored at -20°C. NN1 displays an even greater preference 
for cleavage of AGATCT than clones 9 or 121 and purified NN1 
protein maintains the same thermostability as wild-type BstYI. 

By combining the genetic selection method of the present 
invention, interpretation of the selected genetic alterations by a 
skilled artisan and site-directed mutagenesis, a BstYI variant 
was isolated that preferentially recognizes and cleaves 5'- 
AGATCT-3' (Figures 3 and 4). NN1 displays a 7-fold preference 
for recognition and cleavage of 5'-AGATCT-3' relative to 5'- 
AGATCC-3' or 5'-GGATCT-3' and recognition and cleavage of the 
5'-GGATCC-3' (SEQ ID NO:6) site is not detected. (Relative 
amounts of cleavage were calculated by quantifying the rate of 
product formation displayed in Figure 4.) 
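
A fold preference of this kind can be expressed as a ratio of product-formation rates. The sketch below shows one way such a ratio might be computed, assuming a linear early time course. The time points and percent-cleaved values are invented for illustration and are not measurements from Figure 4, and the function name `initial_rate` is hypothetical.

```python
def initial_rate(times, product):
    """Least-squares slope of product formed versus time."""
    n = len(times)
    mean_t = sum(times) / n
    mean_p = sum(product) / n
    numerator = sum((t - mean_t) * (p - mean_p) for t, p in zip(times, product))
    denominator = sum((t - mean_t) ** 2 for t in times)
    return numerator / denominator

# Hypothetical time course (minutes) and percent substrate cleaved.
times = [0, 5, 10, 15]
cleaved_agatct = [0.0, 14.0, 28.0, 42.0]  # illustrative values only
cleaved_agatcc = [0.0, 2.0, 4.0, 6.0]     # illustrative values only

fold_preference = initial_rate(times, cleaved_agatct) / initial_rate(times, cleaved_agatcc)
print(round(fold_preference, 2))
```

With real data, only the early, approximately linear part of each time course would normally be fitted, since substrate depletion flattens the curve at later times.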
Example 2 

Selection of NotI variants with altered specificity for DNA 
recognition and cleavage 

The recognition sequence of NotI is 5'-GCGGCCGC-3'. 
Restriction endonucleases with 8-bp recognition sites are rarely 
found in nature. Therefore, development of such enzymes by 
protein engineering is of considerable commercial interest. The 
method of the present invention is especially suited for this type 
of endeavor since an extensive range of DNA methylases are 
available that will modify derivations of the wild-type NotI 
recognition sequence. However, randomly choosing a non- 
cognate methylase to employ in the genetic selection of NotI 
variants imposes an unfair bias on the evolution process. 
Instead, a preliminary study was conducted to isolate a NotI 
variant with relaxed specificity. This work was modeled after a 
study of EcoRI substrate specificity where important amino acid 
residues were identified using a DNA-damage indicator strain 
expressing the cognate EcoRI methylase (Heitman and Model, 
EMBO J. 9:3369-3378 (1990)). In the case of the NotI study, 
the EagI methylase served as the "cognate" methylase since it 
modifies the sequence 5'-NCGGCCGN-3' and protects the host 
genomic DNA from cleavage by wild-type NotI. 

I. Isolation of NotI variant 44-2A 

The EagI methylase clone (pACYC184EagIM) was isolated 
from the NotI overproduction strain (NEB #816 and U.S. Patent 
No. 5,371,006) and transformed into DNA-damage indicator 
strain ER1992. The wild-type notIR gene was subjected to error- 
prone PCR mutagenesis with the following conditions: 10 ng 
plasmid template pAII17-NotIR, 0.4 µM T7 promoter and T7 
terminator primers, 5U Taq DNA polymerase, 7 mM MgSO4, 0.2 
mM dATP, 0.2 mM dGTP, 1.0 mM dCTP, 1.0 mM dTTP, 10 mM 
KCl, 10 mM (NH4)2SO4, 20 mM Tris-HCl (pH 8.8 @ 25°C) and 
0.1% Triton X-100. Thermocycling parameters were 1 cycle for 
3 min @ 94°C, 25 cycles (1 min @ 94°C, 1 min @ 50°C, 1 min 
@ 72°C) and 1 cycle for 7 min @ 72°C. The PCR product was 
digested with XbaI and SalI, gel-purified and cloned into pAGR3. 
The pAGR3 mutagenic library was transformed into ER1992 
[pACYC184EagIM] cells by electroporation and plated on LB- 
agar containing 100 µg/ml ampicillin, 33 µg/ml chloramphenicol 
and 80 µg/ml X-gal. Of approximately 60,000 transformants, 10 
blue colonies were isolated. The endonuclease clones present 
within these colonies were amplified by colony PCR and cloned 
into T7 expression vector pAII17. The substrate specificity of 
each clone was analyzed by overexpression within ER2744 
carrying pACYC184-EagIM and pSYX20-HhaIM. The cellular 
extract of clone 44-2A produced an alternate cleavage pattern 
when incubated with substrate pUCAdenoXba (previously 
linearized by PmeI). Specifically, the 16,748 bp fragment 
produced by wild-type NotI was partially digested into 7 kb and 
9 kb fragments (see Figure 5). Inspection of the substrate 
sequence determined that cleavage at 5'-GCAGCTGC-3' and/or 
5'-GCTGCAGC-3' may be responsible for production of the 
alternate restriction fragments. 
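
Inspecting a substrate sequence for candidate sites, and predicting the fragments a cut would produce, can be sketched as follows. This is an illustration under stated assumptions: the 30 bp sequence is a made-up toy, the cut is placed at the first base of the site because the cleavage position of the variant is not given in the text, and the function names are hypothetical.

```python
import re

# Regex fragments for the IUPAC codes needed here (W = A or T).
IUPAC_REGEX = {"A": "A", "C": "C", "G": "G", "T": "T", "W": "[AT]", "N": "[ACGT]"}

def find_sites(sequence, site):
    """Return 0-based start positions of every (possibly overlapping) match."""
    pattern = "".join(IUPAC_REGEX[code] for code in site)
    return [m.start() for m in re.finditer(f"(?=({pattern}))", sequence)]

def linear_fragments(length, cut_positions):
    """Fragment lengths after cutting a linear molecule at the given positions."""
    bounds = [0] + sorted(cut_positions) + [length]
    return [right - left for left, right in zip(bounds, bounds[1:])]

# Toy linear substrate with a single 5'-GCAGCTGC-3' site (not real sequence data).
toy_substrate = "ATATATATAT" + "GCAGCTGC" + "TTTTTTTTTTTT"
sites = find_sites(toy_substrate, "GCWGCWGC")
print(sites)                                         # [10]
print(linear_fragments(len(toy_substrate), sites))   # [10, 20]
```

Applied to the real substrate, the same approach would list every 5'-GCWGCWGC-3' position within the 16,748 bp NotI fragment, and the predicted fragment lengths could then be compared with the bands of about 7 kb and 9 kb.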
DNA sequencing of clone 44-2A revealed nucleotide 
substitutions that result in the amino acid substitutions P9S, 
E156K and I201T. We deduced that the relaxed specificity was 
at least partially caused by the E156K substitution, as this 
variant could not be isolated in host ER2744 [pACYC-EagIM, 
pSYX20-HhaIM] (due to cell toxicity). Subsequently, the E156K 
substitution was isolated in an allele that contained an amber 
codon at position 37 as a means of attenuation. The 
Amb37/E156K allele was expressed from pAGR3 in ER1992 
[pACYC184-EagIM] and the resulting extract produced the same 
altered pattern of digestion as the original clone, 44-2A, thus 
confirming the importance of substitution E156K for relaxation 
of substrate specificity. 

The altered substrate specificity of E156K was investigated 
in vivo by expressing the BbvI methylase in addition to the EagI 
methylase. The BbvI methylase modifies the C5 position of the 
first cytosine in the sequences 5'-GCAGC-3' and 5'-GCTGC-3'. 
Therefore, modification of 5'-GCAGCTGC-3' and 5'-GCTGCAGC- 
3' by BbvI methylase is predicted to protect DNA from cleavage 
by a variant NotI (REBASE: http://rebase.neb.com). In fact, 
modification of the genomic DNA by the BbvI methylase 
increased the tolerance of host ER2744 to Amb37/E156K 
expressed from pAGR3. When plated on LB-agar containing 
increasing levels of IPTG, loss of cell viability was detected at 80 
µM IPTG in the presence of BbvI and EagI methylation; 
whereas, viability was lost at 25 µM IPTG when the host was 
protected by EagI methylation only. This rapid in vivo test 
serves to give an indication of the altered specificity and aids in 
the choice of the appropriate methylase to be employed in the 
genetic selection of the present invention. 
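
Whether a candidate methylase covers a target site can be checked by simple string matching. The sketch below asks only whether each strand of an octamer contains one of the two BbvI methylase pentamers named above. It is a simplified illustration, it makes no claim about whether methylation at those positions actually blocks cleavage, and the function names are hypothetical.

```python
from itertools import product

# Pentamers modified by the BbvI methylase, as stated in the text.
BBVI_METHYLASE_SITES = ("GCAGC", "GCTGC")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def has_bbvi_site_on_both_strands(octamer):
    """True if each strand of the octamer contains a BbvI methylase pentamer."""
    strands = (octamer, reverse_complement(octamer))
    return all(
        any(site in strand for site in BBVI_METHYLASE_SITES)
        for strand in strands
    )

# Enumerate 5'-GCWGCWGC-3' with W = A or T.
targets = ["GC" + w1 + "GC" + w2 + "GC" for w1, w2 in product("AT", repeat=2)]
for target in targets:
    print(target, has_bbvi_site_on_both_strands(target))
```

All four 5'-GCWGCWGC-3' sequences pass this check, whereas the wild-type NotI site 5'-GCGGCCGC-3' does not, which is consistent with relying on the EagI methylase for the wild-type site and on the BbvI methylase for the relaxed sites.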

II. Genetic Selection of NotI Variants Possessing 
5'-GCWGCWGC-3' Specificity 

The "relaxed" NotI variant Amb37/E156K can be subjected 
to one or more rounds of genetic selection in order to isolate an 
endonuclease that preferentially recognizes 5'-GCWGCWGC-3' 
(where W = A or T). 

In round one, a mutagenized Amb37/E156K library is 
ligated into pAGR3 and the resulting clones transformed into 
ER2744 protected by only the BbvI methylase. The appropriate 
amount of IPTG in selection one is determined by testing which 
level results in an adequate number of "active" survivors as 
revealed in selection two by the indicator strain ER1992. Active 
clones from selection two are transformed into ER1992 
expressing the BbvI methylase. White or light-blue colonies in 
step three are cultured, cell extract is prepared and the 
endonuclease activity is analyzed in vitro by incubation with an 
appropriate DNA substrate. Clones displaying an even greater 
preference for 5'-GCWGCWGC-3' as compared to the parent are 
sequenced and the responsible amino acid alterations are 
determined by inference and/or site-directed mutagenesis. The 
most desired variant(s) are subjected to a second round of 
genetic selection and so forth. In each subsequent round it is 
important to increase the in vivo selection pressure imposed on 
the mutagenized gene library. In this example, the level of IPTG 
can be incrementally increased in each round and/or codon 37 
can be restored to CAG thus significantly increasing the in vivo 
DNA-damaging effects of those variants possessing activity 
outside the spectrum of BbvI methylation. Finally, the desired 
NotI clone will be sequenced to determine the genetic 
alterations and the protein will be over-expressed, purified to 
near homogeneity and characterized in detail. 

Example 3 

Selection of endonuclease variants with an alternate 
specificity for DNA cleavage 

The genetic selection method of the present invention can 
be applied to engineer an existing endonuclease to recognize 
and cleave an alternate DNA sequence. As outlined in Figure 1A, 
an element of the genetic selection procedure is methylation 
protection of the host genomic DNA by one or more non-cognate 
DNA methyltransferases. The imposed DNA methylation pattern 
specifically protects the desired, alternate sequence(s) while 
allowing the cognate DNA sequence(s) to be efficiently cleaved 
by the wild-type endonuclease. This critical element can be 
verified by in vivo and in vitro studies of the non-cognate DNA 
methyltransferase(s) and the wild-type endonuclease. 
Regardless of the temperature optimum of the wild-type 
endonuclease, the activity of the starting mutagenic library can 
be attenuated to maximize the efficiency of the genetic selection 
process. Options include creating an amber codon within the 
open reading frame, creating an opal codon within the open 
reading frame, changing the start codon to GTG or mutating the 
RBS sequence to decrease the translational efficiency. In 
addition, the use of a T7 expression vector is especially 
advantageous for achieving low-level constitutive expression of 
the endonuclease library in a DNA-damage reporter strain that 
does not carry a T7 RNA polymerase gene. For example, the 
mutagenic library for round one can be cloned into the T7 
expression vector pAII17. Mutagenesis of the endonuclease 
gene can be accomplished by error-prone PCR conditions "A1" 
as described in Example 1. After error-prone PCR amplification, 
the mutagenic library (Rgene*) is digested with the appropriate 
restriction enzymes, ligated into pAII17 and subjected to the 
following genetic selections: 

SELECTION 1 - The mutagenic library is introduced into 
strain 1 by electroporation or transformation of chemically- 
competent cells. Strain 1 (e.g., ER2744 [pACYC-Mgene]) will 
have been pre-modified with the desired, non-cognate pattern 
of methylation. Survivors can be selected at 25°C-42°C on LB- 
agar plates containing a low level of IPTG (0-100 µM), Amp and 
an antibiotic to ensure maintenance of the methylase 
plasmid(s). Surviving colonies from each plate are pooled and 
plasmid DNA is prepared from each pool. The methylase 
plasmid(s) is destroyed by digestion with multiple restriction 
enzymes. 

SELECTION 2 - The mutagenic endonuclease sub-libraries 
are transformed into a DNA-damage indicator strain such as 
ER1992 and plated at 30°C-42°C on LB-agar containing Amp 
and X-gal. Individual active clones displaying a dark blue colony 
phenotype are cultured for 2-16 hours at 30°C-42°C and 
plasmid DNA is prepared from these cultures. 

SELECTION 3 - Individual clones can be transformed into 
ER1992 [pACYC-Mgene] which has been pre-modified with the 
same pattern of methylation as in strain 1/selection 1. The 
transformants can be plated at 30°C-42°C on Amp, X-gal and 
an antibiotic to maintain the methylase plasmid(s). Individual 
white colonies are cultured, induced with IPTG and cell extract is 
prepared. Alteration of endonuclease specificity can be analyzed 
by adding the cell extract to an in vitro DNA cleavage reaction 
containing a substrate which allows distinction between 
restriction of the original recognition sequence(s) and a 
preference for the desired recognition sequence(s). 
Endonuclease variants displaying the desired specificity (or 
partial alteration of specificity) are sequenced and the 
responsible mutations are determined by inference and/or site- 
directed mutagenesis. The mutations responsible for alteration 
of specificity can be combined in one endonuclease clone by 
site-directed mutagenesis. This clone, or any of the variants 
selected directly from round one, can be chosen for further 
improvement by a subsequent round of genetic selections. 
During selection 1 of round two, the IPTG level will be increased 
as compared to round one in order to increase the selection 
pressure. This will increase the likelihood of eliminating those 
clones with only a partial alteration of substrate specificity. The 
selection process can end after round two or can proceed for 
multiple rounds until the desired endonuclease variant is 
isolated. With each subsequent round, selection 1 will be made 
more stringent, preferably by increasing the level of IPTG 
present in the agar plates. Finally, the desired endonuclease 
clone will be sequenced to determine the genetic alterations and 
the protein will be over-expressed, purified to near homogeneity 
and characterized in detail. 
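
The decision logic of the three selections can be summarized in a few lines. The sketch below is an illustrative reading of the scheme described in this Example, not a disclosed implementation: the class, field names and outcome strings are assumptions, and colony color is reduced to two labels.

```python
from dataclasses import dataclass

@dataclass
class CloneResult:
    name: str
    survived_selection_1: bool     # viable in the methylase-protected host
    color_without_methylase: str   # selection 2, indicator strain alone
    color_with_methylase: str      # selection 3, indicator strain plus methylase

def classify(clone):
    """Map the outcomes of selections 1-3 to a screening decision."""
    if not clone.survived_selection_1:
        return "eliminated: lethal despite non-cognate methylation"
    if clone.color_without_methylase != "dark blue":
        return "discarded: no DNA damage signal, endonuclease likely inactive"
    if clone.color_with_methylase == "white":
        return "candidate: assay cell extract in vitro"
    return "discarded: still damages methylase-protected DNA"

clones = [
    CloneResult("variant-1", True, "dark blue", "white"),
    CloneResult("variant-2", True, "white", "white"),
    CloneResult("variant-3", True, "dark blue", "dark blue"),
]
for clone in clones:
    print(clone.name, "->", classify(clone))
```

Only clones that are active in the unprotected indicator strain yet silent in the methylase-protected indicator strain proceed to the in vitro cleavage assay, which remains the step that actually establishes the altered specificity.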
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What is claimed is: 

1. A method for altering an endonuclease recognition site 
specificity, comprising: 

(a) subjecting a mutagenized endonuclease gene library to a 
genetic selection in a population of prokaryotic host cells 
expressing one or more non-cognate DNA methyltransferases, 
wherein the genetic selection selects for viable cells in the 
population; and 

(b) identifying whether the viable cells express an active 
mutated endonuclease with an altered recognition site 
specificity. 

2. A method according to claim 1, wherein the mutagenized 
endonuclease gene library is formed by: error prone PCR, chemical 
mutagenesis, assembly PCR, DNA shuffling, in vivo mutagenesis, 
cassette mutagenesis, recursive ensemble mutagenesis or 
exponential ensemble mutagenesis. 

3. A method according to claim 1, wherein the endonuclease 
activity is attenuated. 

4. A method according to claim 3, wherein attenuating the 
activity of the endonuclease expressed by the mutagenized 
endonuclease gene library is achieved by modifying the mutagenized 
endonuclease gene library using modification means selected from: 
creating an amber codon within the open reading frame; creating an 
opal codon within the open reading frame; changing the start codon 
to GTG or TTG; mutating the RBS sequence or utilizing a T7 
expression vector wherein the host cell is T7 RNA polymerase 
negative. 

5. A method according to claim 1, wherein step (b) comprises: 
pooling viable prokaryotic host cells; isolating from the host cells, 
plasmid DNA encoding mutagenized endonuclease genes from the 
library; and transforming the plasmid DNA into a population of 
indicator cells for detecting DNA damage. 

6. A method according to claim 5, wherein the mutagenized 
endonuclease genes are subjected to repeated genetic selections in 
the population of host cells of claim 1 and in the population of 
indicator cells. 

7. A method according to claim 6, wherein the genetic selection 
in the population of indicator cells comprises a first population of 
indicator cells lacking a non-cognate methylase(s) and a second 
population of indicator cells expressing the non-cognate 
methylase(s). 

8. A method according to claim 1, wherein altered recognition- 
site specificity comprises: relaxed recognition-site specificity, 
increased recognition-site specificity or alternate recognition-site 
specificity. 
9. A method according to claim 1, further comprising determining 
a sequence for the recognition site for the endonuclease. 

10. A method for altering recognition site specificity of an 
endonuclease, comprising: 

(a) creating a mutagenized endonuclease gene expression 
plasmid library from a target endonuclease gene and 
transforming prokaryotic cells with the mutagenized library, 
wherein the prokaryotic cells express one or more non- 
cognate methyltransferases; 

(b) selecting prokaryotic cells which are viable after 
transformation and isolating plasmid DNA from the viable 
cells; 

(c) determining whether the isolated plasmid DNA encodes 
an active endonuclease by transforming the plasmid DNA into 
DNA-damage indicator cells; 

(d) screening the plasmid DNA encoding the active 
endonuclease for altered specificity; and 

(e) optionally repeating steps (a) through (d) to obtain the 
endonuclease with altered recognition-site specificity. 



11. A method according to claim 10, further comprising: 
determining the altered recognition site for the endonuclease. 
12. A method according to claims 1 or 10 wherein the 
endonuclease is BstYI. 

13. A method according to claims 1 or 10 wherein the 
endonuclease is NotI. 

14. A method according to claim 12, wherein the recognition site 
specificity is altered from 5'-RGATCY-3' to 5'-AGATCT-3'. 

15. An endonuclease having an altered recognition site specificity 
wherein the specificity is altered according to claim 1 or claim 10. 

16. A modified BstYI enzyme, having a preferred recognition site 
specificity of 5'-AGATCT-3'. 

17. A method for modifying recognition site specificity of an 
endonuclease from a parent specificity to a target specificity, 
comprising: 

(a) obtaining a sequence for a plurality of mutated 
endonucleases obtained according to any of the methods of claims 1 
or 10 to determine the mutation(s) for each mutated endonuclease; 
and 

(b) mutating a gene encoding the endonuclease to produce 
one or more of the mutations identified in step (a) so as to produce 
an endonuclease with the target specificity. 
Mutagenize Rgene by error-prone PCR 
Ligate Rgene* to expression vector 

↓ 

Transform into ER2502 [pMgene] 
Select survivors at 25°C-42°C on LB agar + Amp, Cam, IPTG 
Make a plasmid library of pRgene* and pMgene 

↓ 

Eliminate pMgene by digestion, but leave pRgene* intact 
Transform ER1992 dinD::lacZ indicator strain with pRgene* library 
Select individual dark blue colonies at 30°C-42°C on LB agar + X-gal, Amp (No IPTG) 
Prepare plasmid DNA from individual dark blue colonies 
Transform pRgene* clones into ER1992 [pMgene] 

↓ 

Select white colonies at 30°C-42°C on LB agar + X-gal, Amp, Cam 

↓ 

Screen white colonies 
The white colonies may contain Rgene* variants with desired specificity 
Genetic Selection of BstYI Variants with 5'-AGATCT-3' specificity 

Mutagenize bstYIR gene by error-prone PCR 
Ligate bstYIR* to pAGR3 (Ptac) 

↓ 

Transform into ER2502 [pSYX-BglIIM] 

↓ 

Select survivors at 37°C on LB agar + Amp, Tet, 0.3 mM IPTG 

↓ 

Make a plasmid library of pAGR3-BstYIR* and pSYX-BglIIM 
Eliminate pSYX-BglIIM by digestion, but leave pAGR3-BstYIR* intact 
Transform ER1992 dinD::lacZ indicator strain with pAGR3-BstYIR* library 
Select individual dark blue colonies at 37°C on LB agar + X-gal, Amp (No IPTG) 

↓ 

Prepare plasmid DNA from individual dark blue colonies 
Transform pAGR3-BstYIR* clones into ER1992 [pSYX-BglIIM] 
Select white colonies at 37°C on LB agar + X-gal, Amp, Tet 

↓ 

Screen white colonies 
The white colonies may contain BstYI* mutants with BglII specificity 
  1 ATGAGAATTGTTGAAGTATATTCGCATTTGAACGGGTTGGAATACATACAAGTTCACTTG  60 
    MRIVEVYSHLNGLEYIQVHL 

 61 CCACATATTTGGGAAGAAATTCAAGAAATTATTGTTTCTATTGACGCAGAAGCTTGTAGA 120 
    PHIWEEIQEIIVSIDAEACR 

121 ACGAAGGAATCAAAAGAAAAGACAAAACAAGGACAAATACTTTATAGTCCCGTAGCTTTA 180 
    TKESKEKTKQGQILYSPVAL 

181 AATGAAGCATTCAAGGAAAAATTAGAAGCAAAAGGTTGGAAAGAAAGTCGAACAAACTAT 240 
    NEAFKEKLEAKGWKESRTNY 

241 TATGTGACTGCTGACCCAAAGCTGATTCGTGAAACATTATCACTTGAACCAGAGGAACAA 300 
    YVTADPKLIRETLSLEPEEQ 

301 AAGAAAGTGATTGAAGCCGCAGGAAAAGAAGCATTAAAGTCTTATAATCAAACGGATTTT 360 
    KKVIEAAGKEALKSYNQTDF 

361 GTAAAAGATAGAGTGGCAATAGAAGTTCAATTCGGAAAATATTCTTTTGTCGCTTATGAC 420 
    VKDRVAIEVQFGKYSFVAYD 

421 CTTTTCGTCAAACACATGGCTTTCTATGTTAGTGATAAAATTGACGTTGGTGTCGAAATA 480 
    LFVKHMAFYVSDKIDVGVEI 

481 TTGCCAATGAAGGAATTATCAAAAGAAATGTCTTCGGGAATCAGTTATTACGAAGGTGAA 540 
    LPMKELSKEMSSGISYYEGE 

541 TTATACAATGTGATACGGCAAGGTCGTGGCGTTCCTGCCGTTCCGTTGGTTTTAATCGGG 600 
    LYNVIRQGRGVPAVPLVLIG 

601 ATTGCCCCTTAA 612 
    IAP* 
[FIGURE 4B (image not reproduced). Recovered labels: enzyme NN1; axis label "time"; substrate sites 5' AGATCT 3' / 3' TCTAGA 5', 5' GGATCC 3' / 3' CCTAGG 5', and a site whose bottom strand reads 3' TCTAGG 5'.] 
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<110> NEW ENGLAND BIOLABS, INC. 
SAMUELSON, JAMES 
XU, SHUANG-YONG 

<120> ALTERATION OF RESTRICTION ENDONUCLEASE SPECIFICITY BY GENETIC SELECTION 

<130> NBB-198-PCT 

<150> US 60/347,403 
<151> 2002-01-10 



<160> 
<170> PatentIn version 3.1 

<210> 1 
<211> 612 
<212> DNA 
<213> Bacillus stearothermophilus Y406 

<220> 
<221> CDS 
<222> (1)..(612) 
<223> 

<400> 1 

atg aga att gtt gaa gta tat tcg cat ttg aac ggg ttg gaa tac ata 48 
Met Arg Ile Val Glu Val Tyr Ser His Leu Asn Gly Leu Glu Tyr Ile 
1 5 10 15 

caa gtt cac ttg cca cat att tgg gaa gaa att caa gaa att att gtt 96 
Gln Val His Leu Pro His Ile Trp Glu Glu Ile Gln Glu Ile Ile Val 
20 25 30 

tct att gac gca gaa gct tgt aga acg aag gaa tca aaa gaa aag aca 144 
Ser Ile Asp Ala Glu Ala Cys Arg Thr Lys Glu Ser Lys Glu Lys Thr 
35 40 45 

aaa caa gga caa ata ctt tat agt ccc gta gct tta aat gaa gca ttc 192 
Lys Gln Gly Gln Ile Leu Tyr Ser Pro Val Ala Leu Asn Glu Ala Phe 
50 55 60 

aag gaa aaa tta gaa gca aaa ggt tgg aaa gaa agt cga aca aac tat 240 
Lys Glu Lys Leu Glu Ala Lys Gly Trp Lys Glu Ser Arg Thr Asn Tyr 
65 70 75 80 

tat gtg act gct gac cca aag ctg att cgt gaa aca tta tca ctt gaa 288 
Tyr Val Thr Ala Asp Pro Lys Leu Ile Arg Glu Thr Leu Ser Leu Glu 
85 90 95 

cca gag gaa caa aag aaa gtg att gaa gcc gca gga aaa gaa gca tta 336 
Pro Glu Glu Gln Lys Lys Val Ile Glu Ala Ala Gly Lys Glu Ala Leu 
100 105 110 

aag tct tat aat caa acg gat ttt gta aaa gat aga gtg gca ata gaa 384 
Lys Ser Tyr Asn Gln Thr Asp Phe Val Lys Asp Arg Val Ala Ile Glu 
115 120 125 

gtt caa ttc gga aaa tat tct ttt gtc gct tat gac ctt ttc gtc aaa 432 
Val Gln Phe Gly Lys Tyr Ser Phe Val Ala Tyr Asp Leu Phe Val Lys 
130 135 140 

cac atg gct ttc tat gtt agt gat aaa att gac gtt ggt gtc gaa ata 480 
His Met Ala Phe Tyr Val Ser Asp Lys Ile Asp Val Gly Val Glu Ile 
145 150 155 160 

ttg cca atg aag gaa tta tca aaa gaa atg tct tcg gga atc agt tat 528 
Leu Pro Met Lys Glu Leu Ser Lys Glu Met Ser Ser Gly Ile Ser Tyr 
165 170 175 

tac gaa ggt gaa tta tac aat gtg ata cgg caa ggt cgt ggc gtt cct 576 
Tyr Glu Gly Glu Leu Tyr Asn Val Ile Arg Gln Gly Arg Gly Val Pro 
180 185 190 

gcc gtt ccg ttg gtt tta atc ggg att gcc cct taa 612 
Ala Val Pro Leu Val Leu Ile Gly Ile Ala Pro 
195 200 

<210> 2 
<211> 203 
<212> PRT 

<213> Bacillus stearothermophilus Y406 
<400> 2 

Met Arg Ile Val Glu Val Tyr Ser His Leu Asn Gly Leu Glu Tyr Ile 
1 5 10 15 

Gln Val His Leu Pro His Ile Trp Glu Glu Ile Gln Glu Ile Ile Val 
20 25 30 

Ser Ile Asp Ala Glu Ala Cys Arg Thr Lys Glu Ser Lys Glu Lys Thr 
35 40 45 

Lys Gln Gly Gln Ile Leu Tyr Ser Pro Val Ala Leu Asn Glu Ala Phe 
50 55 60 

Lys Glu Lys Leu Glu Ala Lys Gly Trp Lys Glu Ser Arg Thr Asn Tyr 
65 70 75 80 

Tyr Val Thr Ala Asp Pro Lys Leu Ile Arg Glu Thr Leu Ser Leu Glu 
85 90 95 

Pro Glu Glu Gln Lys Lys Val Ile Glu Ala Ala Gly Lys Glu Ala Leu 
100 105 110 

Lys Ser Tyr Asn Gln Thr Asp Phe Val Lys Asp Arg Val Ala Ile Glu 
115 120 125 

Val Gln Phe Gly Lys Tyr Ser Phe Val Ala Tyr Asp Leu Phe Val Lys 
130 135 140 

His Met Ala Phe Tyr Val Ser Asp Lys Ile Asp Val Gly Val Glu Ile 
145 150 155 160 

Leu Pro Met Lys Glu Leu Ser Lys Glu Met Ser Ser Gly Ile Ser Tyr 
165 170 175 

Tyr Glu Gly Glu Leu Tyr Asn Val Ile Arg Gln Gly Arg Gly Val Pro 
180 185 190 

Ala Val Pro Leu Val Leu Ile Gly Ile Ala Pro 
195 200 
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